Containment of Relational Queries with Annotation Propagation
Abstract
We study the problem of determining whether one query is contained in another when queries can carry annotations along from the source data. We say that a query is annotation-contained in another if the annotated output of the former is contained in the annotated output of the latter on every possible annotated input database. We study the relationship between query containment and annotation-containment and show that annotation-containment is, in general, a more refined notion. As a consequence, the usual equivalences used by a typical query optimizer may no longer hold when queries can carry annotations from the source to the output. Despite this, we show that the same annotated result is obtained whether the intermediate constructs of a query are evaluated under set or bag semantics. We also give a necessary and sufficient condition, via homomorphisms, for checking whether a query is annotation-contained in another. Even though our characterization suggests that annotation-containment is more complex than query containment, we show that the annotation-containment problem is NP-complete, which puts it in the same complexity class as query containment. In addition, we show that the annotation placement problem, which was first shown to be NP-hard in [BKT02], is in fact DP-hard; the exact complexity of this problem remains open.
* This research is supported by faculty research funds granted by the University of California, Santa Cruz.
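For readers unfamiliar with homomorphism-based containment tests, the following is a minimal sketch of the classical test for ordinary (annotation-free) conjunctive queries under set semantics: a query q1 is contained in q2 exactly when there is a homomorphism from q2 to q1. It is not the annotation-aware condition of the paper, which the abstract does not spell out. The query encoding (a head tuple plus a list of body atoms, all terms treated as variables) and the names `find_homomorphism`, `is_contained`, `Q1`, `Q2`, and `Q3` are assumptions made for illustration.

```python
def find_homomorphism(src, dst):
    """Search for a variable mapping h from query src to query dst such that
    h maps the head of src onto the head of dst and every body atom of src
    onto some body atom of dst. Returns the mapping, or None if none exists."""
    src_head, src_body = src
    dst_head, dst_body = dst
    if len(src_head) != len(dst_head):
        return None

    # The head of src must map position by position onto the head of dst.
    start = {}
    for s, d in zip(src_head, dst_head):
        if start.setdefault(s, d) != d:
            return None

    def extend(i, h):
        # Backtracking search over the body atoms of src.
        if i == len(src_body):
            return h
        rel, args = src_body[i]
        for dst_rel, dst_args in dst_body:
            if dst_rel != rel or len(dst_args) != len(args):
                continue
            candidate = dict(h)
            if all(candidate.setdefault(a, b) == b for a, b in zip(args, dst_args)):
                result = extend(i + 1, candidate)
                if result is not None:
                    return result
        return None

    return extend(0, start)


def is_contained(q1, q2):
    """Classical set-semantics test: q1 is contained in q2 iff a homomorphism q2 -> q1 exists."""
    return find_homomorphism(q2, q1) is not None


# Hypothetical example queries over a binary relation R (hypothetical name).
Q1 = (("x",), [("R", ("x", "y")), ("R", ("y", "x"))])  # ans(x) :- R(x,y), R(y,x)
Q2 = (("x",), [("R", ("x", "y"))])                     # ans(x) :- R(x,y)
Q3 = (("x",), [("R", ("x", "y")), ("R", ("x", "z"))])  # ans(x) :- R(x,y), R(x,z)
```

Under this test, `Q1` is contained in `Q2` but not the reverse, while `Q2` and `Q3` contain each other and are therefore equivalent in the classical sense. According to the abstract, equivalences of this classical kind are the ones that may fail to hold once annotations are propagated.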
Similar resources
A Calculus for Propagating Semantic Annotations Through Scientific Workflow Queries
Scientific workflows facilitate automation, reuse, and reproducibility of scientific data management and analysis tasks. Scientific workflows are often modeled as dataflow networks, chaining together processing components (called actors) that query, transform, analyse, and visualize scientific datasets. Semantic annotations relate data and actor schemas with conceptual information from a shared...
Query Containment of Tier-2 Queries over a Probabilistic Database
We study the containment problem for a query language over probabilistic relational databases that allows queries like “is the probability that q1 holds greater than 0.2 and the probability that q2 holds greater than 0.6?” where q1 and q2 are Boolean conjunctive queries. In addition to being a fundamental problem in its own right, the containment problem is the key problem that an optimizer mus...
The complexity of higher-order queries
Higher-order transformations are ubiquitous within data management. In relational databases, higher-order queries appear in numerous aspects including query rewriting and query specification. This work investigates languages that combine higher-order transformations with ordinary relational database query languages. We study the two most basic computational problems associated with these query ...
Equivalence of DATALOG Queries is Undecidable
Datalog is a powerful query language for relational databases [10]. We consider the problems of determining containment, equivalence, and satisfiability of Datalog queries. We show that containment and equivalence are recursively unsolvable. This should be contrasted with the work of Aho, Sagiv, and Ullman on relational queries [1]. Satisfiability is easily decidable for Datalog queries. We a...
Query processing concepts and techniques for set containment tests
Relational division is an operator of the relational algebra that realizes universal quantifications in queries against a relational database. Expressing a universal quantification problem in SQL is cumbersome. If the division operator would have a counterpart in a query language, a more intuitive formulation of universal quantification problems would be possible. Although division is a derived...
Publication date: 2003